


WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau




(51) International Patent Classification 5:
C12N 15/11, C12P 21/08 
C12Q 1/68, C12N 15/62 
C07K 15/28 



INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)



A2 



(11) International Publication Number: WO 94/01548

(43) International Publication Date: 20 January 1994 (20.01.94)

(21) International Application Number: PCT/GB93/01467

(22) International Filing Date: 13 July 1993 (13.07.93)

(30) Priority data:
9214857.6    13 July 1992 (13.07.92)    GB



(71) Applicant (for all designated States except US): MEDICAL
RESEARCH COUNCIL [GB/GB]; 20 Park Crescent,
London W1N 4AL (GB).

(72) Inventors; and

(75) Inventors/Applicants (for US only): SIBSON, David, Ross
[GB/GB]; 37 Grimsdells Lane, Amersham, Buckinghamshire
HP6 6HF (GB). GROSS, Jacqueline [GB/GB];
47 Boxmoor Road, Kenton, Middlesex HA3 8LH (GB).
HADFIELD, Kathryn, Mary [GB/GB]; 5 Carlisle Terrace,
St Ives, Huntingdon, Cambridgeshire PE17 4PO
(GB). HOWELLS, David [GB/GB]; 77 Puttocks Drive,
Welham Green, Hatfield, Hertfordshire AL9 7LW (GB).
STARKEY, Michael [GB/GB]; 27 Creasy Close, Abbots
Langley, Hertfordshire WD5 0HS (GB). KELLY, Maria
[...] Road, Ealing, London W5 3ST
(GB). SHAW, Diana [GB/CA]; 342 Glacier Hall, University
of Calgary, 2500 University Drive NW, Calgary,
Alberta T2N 1N4 (CA).



(74) Agent: BIZLEY, Richard, Edward; Hepworth Lawrence
Bryer & Bizley, 2nd Floor, Gate House South, West Gate,
Harlow, Essex CM20 UN (GB).

(81) Designated States: AT, AU, BB, BG, BR, BY, CA, CH,
[...], KR, KZ, LK, [...], PT, RO, [...],
European patent (AT, BE,
CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL,
[...]), OAPI patent ([...], CM, GA,
GN, ML, MR, NE, SN, TD, TG).

Published 

Without international search report and to be republished
upon receipt of that report.



(54) Title: HUMAN NUCLEIC ACID FRAGMENTS, ISOLATED FROM BRAIN ADRENAL TISSUE, PLACENTA OR
BONE MARROW
(57) Abstract 




other and to known sequences in public databases, that each of the
cDNAs picked at random could be unambiguously classified. The cDNAs
could be classified as being either entirely new or as corresponding,
to a greater or lesser extent, to a previously known sequence. cDNAs
identified in this way were further characterised and found to be
useful in a variety of standard applications, including physical
mapping. Unfortunately, such a process is insufficient. The longer
the process is pursued with any given population of cDNAs, the less
efficient it becomes and the lower the rate of identification of new
clones. In essence, as the number of cDNAs which have already been
picked rises, the probability of picking a particular cDNA more than
once increases. This difficulty is exacerbated by the wide range of
abundances at which different cDNAs can occur, which abundances can
vary by several orders of magnitude. Thus, whereas some sequences are
exceedingly rare, a single cDNA type may comprise as much as 10% of the
population of cDNAs produced from a particular tissue (Lewin, B. Gene
Expression, Vol. 2: Eukaryotic Chromosomes, 2nd ed., pp. 708-719. New
York: Wiley, 1980). The need to avoid missing rarer species in any
given population presents a considerable problem.
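
The loss of efficiency described above can be illustrated numerically. The following is a minimal sketch, not part of the original disclosure: it assumes clones are picked independently and at random, and the library composition (one species at 10%, 9,000 equally rare species) is a hypothetical example chosen only to mirror the figure quoted from Lewin.

```python
def expected_distinct(abundances, picks):
    """Expected number of distinct cDNA species seen after `picks` random
    draws (with replacement) from a population with the given relative
    abundances. Species i is missed with probability (1 - p_i) ** picks."""
    total = sum(abundances)
    return sum(1.0 - (1.0 - a / total) ** picks for a in abundances)


# Hypothetical library: one species at 10%, 9,000 rare species sharing 90%.
skewed = [0.10] + [0.90 / 9000] * 9000
# Idealised, fully normalised library with the same number of species.
uniform = [1.0] * 9001

for n in (100, 1000, 10000):
    print(n, round(expected_distinct(skewed, n), 1),
          round(expected_distinct(uniform, n), 1))
```

Under these assumptions the skewed library always yields fewer distinct species per pick than the uniform one, and the number of new species gained from each further batch of picks keeps falling.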

Various approaches (so-called "normalisation" techniques) have been
tried in addressing the problem of increasing the efficiency of
examination of a mixed nucleotide population, for example, such a
population as is to be examined in human genome sequencing.

Thus, a standard PCR protocol can be used to amplify selectively cDNAs
which are present at extremely low levels, if there is information
about the sequence of those cDNAs. If not, a primer specific to the
desired cDNA cannot be constructed and the desired cDNA cannot be
selectively amplified. The standard PCR method is therefore inadequate
if it is desired to characterise a number of unknown genes.




A second approach involves hybridization of cDNA to genomic DNA. At
saturation, the cDNAs recovered from genomic/cDNA hybrids will be
present in the same abundance as the genes encoding them. This will
provide a much more homogeneous population than the original cDNA
library, but does not entirely solve the problem. In order to reach
saturation in respect of the very rare sequences, it will be necessary
to use huge quantities of cDNA, which need to be allowed to anneal to
large amounts of genomic DNA over considerable periods of time.
Furthermore, cDNAs which are homologous to genes which are present in
multiple copies in the genome will be over-represented.




HUMAN NUCLEIC ACID FRAGMENTS, ISOLATED FROM BRAIN
ADRENAL TISSUE, PLACENTA OR BONE MARROW

This invention relates to new nucleic acid fragments encoding gene
products or portions thereof, which fragments are obtainable from human
nucleic acid populations, individual members of such populations being
present in widely varying amounts.

Situations are increasingly arising in which it is necessary to study
complex nucleic acid or polynucleotide populations. For example, it is
now widely appreciated that an invaluable resource could be created if
the entire sequence of the genomes of organisms such as man were
determined and the information available. The magnitude of such a task
should not, however, be underestimated. Thus, the human genome may
contain as many as 100,000 genes [a very substantial proportion of
which may be expressed in the human brain (Sutcliffe, Ann. Rev.
Neurosci. 11:157 (1988))]. Only a very small percentage of the stock
of human genes has presently been explored, and this largely in a
piecemeal and usually specifically targeted fashion.

There has been much public debate about the best means of approaching
human genome sequencing. Brenner has argued (CIBA Foundation Symposium
149-6 (1990)) that efforts should be concentrated on cDNAs produced
from reverse transcribed mRNAs rather than on genomic DNA. This is
primarily because most useful genetic information resides in the
fraction of the genome which corresponds to mRNA, and this fraction is
a very small part of the total (5% or less). Moreover, techniques for
generating cDNAs are also well known. On the other hand, even
supposing near perfect recovery of cDNAs corresponding to all [...]
mRNAs, some potentially useful information will be lost by the cDNA
approach, including sequences responsible for control and regulation of
genes. Nonetheless, the cDNA approach at least substantially reduces
the inherent inefficiencies resulting from analysis of repeated
sequences or non-coding sequences in an approach which depends upon
genomic DNA sequencing.

Recently, the results of a rapid method for identifying and
characterising new cDNAs have been reported (Adams, M.D. et al., Science
252, 1991, pp. 1651-1656). Essentially, a semi-automated sequence
reader was used to produce a single read of sequence from one end of
each of a number of cDNAs picked at random. It was shown, by comparing
the nucleic acid sequences of the cDNAs (or the protein sequences
produced by translating the nucleic acid sequence of the cDNAs) to each




A third approach exploits the second order reassociation kinetics of
cDNA annealing to itself. After a long period of annealing, the cDNAs
which remain single stranded will have nearly the same abundance, and
can be recovered by standard PCR (see Patanjali, S.R. et al., PNAS USA
88, 1991, pp. 1943-1947; Ko, M.S.H., NAR 19, no. 18, 1991, pp. 5705-
5711). The methods disclosed in these two publications, however,
suffer from notable disadvantages. They are entirely dependent on the
stringent physical separation of single stranded and double stranded
DNA, require an elevated number of manual manipulations in each
reaction, and necessitate protracted hybridisation times (up to 288
hours in the method of Patanjali et al.).

Yet a further approach in "normalising" a nucleotide population is
described in co-pending British Patent Application No 91 15407.0, filed
17th July, 1991 by MRC, and involves a PCR process in which a mixture
comprising a heterogeneous DNA population and appropriate
oligonucleotide primers is first formed and the DNA denatured, but
before effecting a conventional PCR protocol the conditions are altered
to allow the denatured strands of the more common DNA species to
reanneal together, whilst avoiding annealing of primers to the DNA
strands. By this means, rarer species can subsequently be amplified in
preference to the more common species.

This PCR normalisation method in general comprises the steps of:

(a) preparing a mixture comprising a heterogeneous DNA population and
oligonucleotide primers suitable for use in a PCR process, in which the
DNA is denatured;

(b) altering the conditions to allow the denatured strands of the
more common DNA species to reanneal, while preventing the annealing of
the primers to the DNA strands;

(c) further altering the conditions of the mixture in order to allow
the primers to anneal to the remaining single-stranded DNA comprising
the rarer DNA species; and

(d) carrying out an extension synthesis in the mixture produced in
step (c).

Advantageously, the method consists of a cyclic application of the
above four steps.




It will be appreciated that the conditions may be altered by the
alteration of the temperature of the reaction mixture. However, any
conditions which affect the hybridisation of complementary DNA strands
to one another may be varied to achieve the required result.

Because the reannealing efficiency of any given DNA species will depend
on the product of its concentration and time, the more abundant the
sequence the greater the extent to which it will reanneal in any given
time period. Once a DNA species has reached a certain threshold
concentration it will no longer be amplified exponentially, as a
significant amount will have annealed to the double stranded form
before the priming step. Thus, as each individual DNA species is
amplified by the process to its threshold concentration, the rate of
amplification of that species will start to tail off. Eventually,
therefore, all DNA species will be present at the same concentration.
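
The tailing-off behaviour described above can be sketched with a toy model. This is an illustrative simplification, not the disclosed protocol: it assumes ideal second-order reannealing, so that a species at concentration c leaves a single-stranded fraction 1/(1 + k*c*t) after a reannealing step of length t, and that only this fraction is primed and copied in each cycle. The rate constant `k`, the time `t` and the starting concentrations are arbitrary, hypothetical values.

```python
def normalise(concs, cycles, k=1.0, t=1.0):
    """Toy model of PCR normalisation.

    In each cycle a species at concentration c reanneals with second-order
    kinetics, leaving a single-stranded fraction 1 / (1 + k*c*t). Only that
    fraction is primed and extended, so c grows by c / (1 + k*c*t).
    """
    concs = list(concs)
    for _ in range(cycles):
        concs = [c + c / (1.0 + k * c * t) for c in concs]
    return concs


# Hypothetical starting abundances spanning six orders of magnitude.
start = [1e-6, 1e-3, 1.0]

for n in (0, 10, 20, 40):
    after = normalise(start, n)
    print(n, "max/min ratio:", max(after) / min(after))
```

In this model rare species roughly double in each cycle until they approach the threshold 1/(k*t), after which growth becomes approximately linear, so the ratio between the most and least abundant species shrinks with every cycle. Lengthening `t` lowers the threshold, which is consistent with determining the reannealing time empirically.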

The length of the reannealing step will determine how much DNA is
present at the threshold concentration. Preferably, therefore, the
duration of the reannealing step will be determined empirically for
each DNA population.

In the PCR normalisation process in general, the DNA primers may be
adapted to prime selectively a sample of the total DNA population. By
using primers which will only prime a sample of the population, only
that sample will be amplified and normalised. The total quantity of
DNA generated will thereby be reduced, which means that the cycling
times can be kept low. This ensures that the method is applicable to
complex DNA populations such as cDNA populations. In addition, a first
primer can be used which is adapted selectively to prime a sample of
the total cDNA population, and a second primer which is a general
primer. Advantageously, the general primer is oligo dT (each primed
cDNA will then be replicated in its entirety, as the oligo dT primer
will anneal to the poly-A tail at the end of the cDNA).

In co-pending British Patent Application No 92 14873.3, filed by MRC
13th July, 1992, a new process is described which allows the study and
identification of the individual members of a mixed or heterogeneous
population of nucleotide sequences, perhaps of varying abundance. In
preferred embodiments of the said process, the starting nucleic acid
population is treated by:

(a) subjecting the nucleic acid to the action of a reagent,
preferably an endonuclease which has its cleavage and recognition sites
separated, which reagent cleaves the nucleic acid so as to produce
double stranded cleavage products the individual strands of which
overlap at cleaved ends to leave a single strand extending to a known
extent;

(b) ligating the cleavage products from (a) with a population of
adaptor molecules to generate adaptored cleavage products, each of
which adaptor molecules has a cleavage product end recognition sequence
and the population thereof encompassing a range of adaptor molecules
having recognition sequences complementary to a predetermined subset of
the sequences of the cleavage-generated extending single strands; and

(c) selecting and separating only those adaptored cleavage products
resulting from (b) which carry an adaptor of predetermined recognition
sequence.

A preferred endonuclease for use in step (a) of the above process is
Fok I.
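
Steps (a) to (c) can be illustrated with a small sketch. It relies on the commonly cited Fok I geometry, which is not stated in this text: recognition sequence GGATG, with the top strand cut 9 nucleotides downstream and the bottom strand 13 nucleotides downstream, leaving a 4-nucleotide 5' overhang. For brevity only sites in the forward orientation are scanned; the clone names, sequences and adaptor set are hypothetical.

```python
from collections import defaultdict

FOKI_SITE = "GGATG"  # Fok I recognition sequence (top strand, 5'->3')
TOP_OFFSET = 9       # top strand is cut 9 nt downstream of the site
OVERHANG = 4         # bottom strand is cut at 13 nt, leaving a 4-nt overhang


def foki_overhangs(seq):
    """Return (cut_position, overhang) for each forward-strand Fok I site."""
    seq = seq.upper()
    results = []
    start = seq.find(FOKI_SITE)
    while start != -1:
        cut = start + len(FOKI_SITE) + TOP_OFFSET
        if cut + OVERHANG <= len(seq):
            results.append((cut, seq[cut:cut + OVERHANG]))
        start = seq.find(FOKI_SITE, start + 1)
    return results


def sort_into_subsets(seqs, adaptor_overhangs):
    """Group cut sites by overhang, keeping only overhangs that an adaptor
    in the chosen subset is designed to recognise (step (c))."""
    subsets = defaultdict(list)
    for name, seq in seqs.items():
        for cut, overhang in foki_overhangs(seq):
            if overhang in adaptor_overhangs:
                subsets[overhang].append((name, cut))
    return dict(subsets)


# Hypothetical cDNA fragments, each containing one forward Fok I site.
cdnas = {
    "clone_A": "TTGGATGACGTACGTAAGCTTGCAAA",
    "clone_B": "CCGGATGTTTTTTTTTACGTCC",
}
print(sort_into_subsets(cdnas, {"AGCT", "ACGT"}))
```

Because the overhang is determined by the sequence adjacent to the site rather than by the site itself, a 4-nucleotide overhang can in principle take any of 256 values, which is what allows a population to be divided into subsets by the choice of adaptor.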



An important feature of this process is the use of adaptor molecules.
The adaptors used must have "overhanging" fragment recognition
sequences which reflect or are complementary to the extending cleavage-derived
sequences which the adaptors are designed to react with. It is
also preferred that the adaptors used should end with a 5' hydroxyl
group. The avoidance of a 5' phosphate group removes the risk of
inappropriate ligation involving the adaptors.

Adaptor molecules may also contain a portion permitting specific
sequence selection and separation (as in step (c) of the process) when
a sequence is attached to the adaptor. For example, an adaptor can
carry biotin, thereby permitting advantage to be taken of the
biotin/avidin reaction in selecting and separating desired adaptored
molecules. Additionally, adaptors preferably comprise a known and selected
sequence such that specifically isolated adaptored molecules can be
amplified by known techniques (such as PCR) using a primer
complementary to the core sequence.

Preferably the adaptors are short double-stranded oligonucleotides
which can be joined to the ends of cleavage products. They will have
been chemically synthesised so that their sequence can be predetermined
and so that large concentrations can be easily produced. They may also
be chemically modified in a way which allows them to be easily purified
during the process. As mentioned above, ideally their 5' ends will be
unphosphorylated so that once joined to fragments the adaptored end of
the latter will no longer be able to participate in further ligation
reactions.

It is preferred that the adaptor cleavage product end recognition
sequences are on the end of the longest oligonucleotide strand
making up the preferred adaptor molecules, are at least 3 nucleotides
in length and with totally random bases at the single-stranded
position(s) two nucleotides in from the end. This then allows
selection to be performed both during the joining reaction and during
subsequent priming reactions. Then, because the final degree of
selection is a result of the product of the degrees of selection
achieved at these two stages, maximum selection can be achieved [...]
adaptor/primer available.
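
The multiplicative effect of selecting at two stages can be made concrete with a short calculation. This sketch assumes, purely for illustration, that each specified base position selects one fragment in four (equal base frequencies) and that the positions read at ligation and at priming are distinct; the function name and parameters are hypothetical.

```python
from fractions import Fraction


def overall_selection(bases_at_ligation, bases_at_priming):
    """Fraction of fragments retained when `bases_at_ligation` positions are
    specified by the adaptor and `bases_at_priming` further positions are
    specified by the primer, assuming each position selects 1 in 4."""
    ligation = Fraction(1, 4 ** bases_at_ligation)
    priming = Fraction(1, 4 ** bases_at_priming)
    return ligation * priming


# Two positions fixed at each stage: 1/16 at ligation, 1/16 at priming.
print(overall_selection(2, 2))
```

On these assumptions, two stages that each select one fragment in 16 together select one in 256, while requiring only 16 distinct adaptors and 16 distinct primers.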

[...] the 5' end of the longest oligonucleotide [...]
also facilitate the use of modified oligonucleotides for separation
[...] will [...] 5' end. This has the double benefit of requiring
just one modified [...] in a position where it cannot
interfere with ligation or subsequent priming reactions.

Although only one type of adaptor is required per ligation reaction, it
[...] subset of sequences be present, because then the opportunity for
fragments in the chosen subset to ligate to each other is minimised.
It is also preferred that the chosen specific adaptor, carrying a
[...], should not only be different from
the other adaptors in its single-stranded extension, but also different
[...] orientation to be introduced [...] oligonucleotide to
facilitate its separation with the cleavage products to which it joins.

[...] categories or subsets [...] specific in some
[...] in step (c). In this way subsets
of sequences can be provided depending upon the specific adaptor
chosen, e.g. for use in subsequent nucleotide sequencing. This
facilitates, for example, the identification of a large population of
[...] a rational approach to splitting such
[...] in turn.



[...] a nucleic acid fragment encoding a gene product or portion
[...] comprising any one of:-

(a) a sequence selected from SEQ ID Nos 1 to 1193;
(b) an allelic variation of a sequence as defined in (a);
(c) a sequence complementary to (a) or (b).

[...] is any one of SEQ ID Nos 1 to 1193, or a complement or allelic
variation thereof. Preferred sequences exhibit no more than [...]
homology to a human sequence known per se.

In a further aspect, the invention provides a nucleic acid fragment
comprising a portion of a sequence as defined above of [...]
such that a probe of the same size and exhibiting complementarity to
said portion can hybridise to said sequence. Preferably such portions
are at least 15 bases in length. It will be appreciated that minor
mismatches in the aforesaid complementarity are not excluded provided
[...] hybridisation conditions [...] skilled person, but [...],
for example, to the following: Melting temperature of hybrids
- [...], E. T. and McCarthy, B. J. Proc. Natl. Acad. Sci. 48 p1390 [...];
Effect of formamide on lowering melting temperature - Casey, [...]
and Davidson, [...] Nucleic Acids Res. 4 [...]; [...]
imperfect homology - Bonner, T. I. et al., J. Mol. Biol. [...];
General - Meinkoth, [...] and Wahl, [...] Anal. Biochem. 138 p26[...];
Oligo hybridisation and washing - Lathe, R. [...] Mol. Biol. 183,
p1 (1985).
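
The notion of a probe of at least 15 bases hybridising despite minor mismatches can be approximated computationally. The sketch below is only a sequence-level stand-in for hybridisation: it counts mismatches between the reverse complement of a probe and each window of a target, and ignores temperature, salt and formamide effects discussed in the cited literature. The function names, the default tolerance of one mismatch and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def best_probe_match(target, probe, max_mismatches=1, min_length=15):
    """Slide the probe's reverse complement along the target and return
    (position, mismatches) for the best window within tolerance, or None."""
    if len(probe) < min_length:
        raise ValueError("probe shorter than the minimum length")
    site = reverse_complement(probe)
    target = target.upper()
    best = None
    for i in range(len(target) - len(site) + 1):
        window = target[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (i, mismatches)
    return best


# Hypothetical target; the probe is complementary to bases 5..19.
target = "GATTACAGATTACAGGCCTTAAGCATCG"
probe = reverse_complement(target[5:20])
print(best_probe_match(target, probe))
```

In practice the permissible number of mismatches depends on the hybridisation and washing conditions, so `max_mismatches` here should be read as a placeholder rather than a recommended value.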

The present invention also envisages DNA constructs comprising
fragments or sequences as referred to above with a control or
regulatory sequence.




The invention includes such DNA constructs using a gene system known in
the art ligated to a sequence or fragment of the invention so as to
enable, upon expression, the provision of a fusion polypeptide.
Preferably, an endopeptidase recognition site is provided such that
when the sequence or fragment is expressed it is expressed in frame
with a known protein with the boundary being a cleavage site for an
endopeptidase with a rare cutting site. The known protein can then be
affinity purified, and the peptide corresponding to the fragment or
sequence in accordance with the invention may be released by the
endopeptidase. Alternatively, the whole protein can be used to raise
antibodies which can then be screened for those directed at polypeptide
corresponding to the fragment or sequence of the invention.

Since the present fragments and sequences can be used to produce, inter 
alia, corresponding genes, whether by isolating them, by synthesis or 
otherwise, such use and the resulting DNA fragments comprising genes 
are further aspects of the invention. 






Yet another aspect of the invention is an expression vector comprising 
a fragment, sequence, gene-comprising DNA fragment, or DNA construct, 
as above, positioned such that that nucleic acid sequence which 
encodes the polypeptide corresponding to said fragment, sequence or DNA 
fragment is in operable reading frame with a control or regulatory 
sequence. 

Other aspects of the invention are host cells incorporating a sequence,
or fragment, or gene-comprising DNA fragment, or DNA construct, as
above, as a heterologous part of the expressible genetic information of
the cell. The production of such modified host cells can be achieved
using methods known in the art. Such modified host cells can be used
to express corresponding proteins, and these materials lend themselves
in turn to the preparation of corresponding monoclonal or polyclonal
antibodies using standard techniques.

Also included in this invention are such antibodies. Reference can be
made, inter alia, to the following literature: Monoclonal antibodies,
Campbell, A. M. Laboratory Techniques in Biochemistry and Molecular
Biology Ed. Burdon, R. H. & van Knippenberg, P. H. vol 13. Elsevier
Amsterdam 1984. Goding, J. Monoclonal antibodies: Principles and
Practice, 2nd Edition, Academic Press, London 1986. Kipps, T. J. and
Herzenberg, L. A., Handbook of Experimental Immunology: Applications
of immunological methods in biomedical sciences, 4th edition Ed. Weir,
D. M. et al., p108 Blackwell Scientific Publications, Oxford. Harlow,
E. and Lane, D. Antibodies, A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York.

Expression in an appropriate higher eucaryotic host may be important to
ensure correct protein folding and also activity. Expression to avoid
copurification of toxic products can sometimes be better performed in
organisms approved for human consumption, eg prokaryotic Bacillus
subtilis, eukaryotic yeast, mammalian cows milk vectors, and other
methods known in the art.

The invention also includes novel gene products or portions thereof 
encoded by a fragment, sequence or gene-comprising DNA fragment of the 
invention. 

It will be appreciated that the sequences of the present invention
collectively have utility based, inter alia, upon their common origin,
and hence they can effectively be considered together rather than as
separate entities. It is convenient to represent them as separate
sequences, because this is how they were produced and serves as
"punctuation" between the different functional entities which each
sequence represents. However, the sequences could just as easily have
been presented as a continuous sequence derived by placing them end to
end in the order in which they were produced, with a separate
indication of where the beginnings and ends of the component sequences
are.
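
The equivalence between separate sequences and one continuous sequence with a separate index of boundaries can be shown in a few lines. This is an illustrative sketch only; the function names and the three short example sequences are hypothetical and are not taken from the sequence listing.

```python
def concatenate(sequences):
    """Join sequences end to end in the given order and record, for each
    one, its (start, end) position in the continuous sequence."""
    boundaries = []
    position = 0
    for seq in sequences:
        boundaries.append((position, position + len(seq)))
        position += len(seq)
    return "".join(sequences), boundaries


def split(continuous, boundaries):
    """Recover the component sequences from the continuous sequence."""
    return [continuous[start:end] for start, end in boundaries]


# Hypothetical stand-ins for individual entries of a sequence listing.
entries = ["ACGTAC", "GGA", "TTTTCA"]
joined, index = concatenate(entries)
print(joined, index)
print(split(joined, index) == entries)
```

The boundary index plays the role of the "punctuation" referred to above: with it, the continuous form carries exactly the same information as the separate sequences.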

In contrast to investigations hitherto, where gene fragments (sequence
fragments) could only be identified through some known characteristic
[for example: their homology to a fragment which largely encodes amino
acids identified by sequencing a previously isolated peptide or is the
antisense of that coding sequence; or their having at least partial
homology to previously characterised nucleic acids; or their having
ability to encode expressed proteins which could later be detected by
functional assays of the cells expressing those proteins or by using
antibodies which had been previously raised against the proteins to
detect their expression (Sambrook J., et al., Molecular Cloning, CSH
Press 1989)], the sequences and fragments described by the present
invention are entirely underivable and unpredictable from the prior
art, but are nonetheless clearly of great value for various purposes.

Thus, such sequences, by comparing them to sequence databases, can be
used as a means for determining the existence of new members of
existing gene families, new human genes when previously only non-human
genes were known, and new genes when previously no genes were known
(Karlin, S. and Altschul, S. F. Proc. Natl. Acad. Sci. 87 p2264-2268
(1990)). In all cases, this allows the isolation of the corresponding
genes and their products, and hence enables the manufacture of
molecules of potential biological interest by recombinant means.
Screening libraries of known materials or hitherto unexplored source
materials for biological efficacy is now an important industrial
activity in the search for new therapies and therapeutics. When new
sequences have already been found to have counterparts in gene families
or in non-human genes then knowledge about biological efficacy may
already be apparent. For example, new receptors or receptor
agonists/antagonists may exhibit differences to known instances of
these molecules, and such differences could make them more suitable as
therapeutics by, for example, exhibiting binding characteristics which
are more in keeping with avoidance of toxicity. Reference can be made,
for example, to polymorphic dopamine receptors and the implications for
mental health (Iversen, L. Nature 358, p109 (1992), and Van Tol, H. M.
M. et al., Nature 358, p149-152 (1992)). Where absolutely required,
realisation of full length cDNAs for expression can be achieved by
using the sequences to screen (by hybridisation) suitable cDNA
libraries containing full length clones (D'Alessio, J. M., et al.,
Focus (Gibco BRL) 9 p1 (1987)). Alternatively, the sequences can be
used to design primers suitable for obtaining the missing sequences by
PCR or other amplification methods (Frohman, M. A., Dush, M. K. and
Martin, G. R., Proc. Natl. Acad. Sci. 85 p8998-9002 (1988)).
Appropriate use of the sequence fragments in antisense or triple helix
(Griffin et al., Science 245 p967-971 (1989)) applications will be
useful for identifying manipulable targets related to disease. For
example, viruses have been inhibited by antisense RNA to their mRNAs
(Chang, L-J., and Stoltzfus, C. M. J. Virol. p921-974 (1987)). A
similar effect could be achieved by targeting the expression of
cellular proteins which are essential for growth or maintenance of the
virus.



Partial or full length cDNAs have great utility once expressed. The
manner of expression can be selected by one skilled in the art to suit
the intended application. Expression of full length cDNAs is typically
required for biological activity. Procaryotic, and lower or higher
eucaryotic hosts may be selected as the host for expression, and higher
eucaryotes may be preferred to ensure correct modifications, for
example, glycosylation in vivo, when this proves to be important.
Expression can be ensured by situating the cDNA appropriately [...]
signals for expression (Amann, E. and Brosius, J. Gene 40 p183 (1985),
Shimuzu, Y. et al., Gene 65, p141 (1988), Straus, D. and Gilbert, W.
Proc. Natl. Acad. Sci. 82, p2014 (1985)). Such signals may include a
promoter for transcription, which may itself be regulatable.

The proteins thus expressed can be screened for activities of
therapeutic or commercial value. It may be that the proteins have to
be first isolated for this purpose or can be assayed in situ. It may
be desirable that some means of stabilising the expressed protein is
employed. This can be achieved, for example (and as indicated
earlier), by expressing in frame as part of a fusion polypeptide (Smith,
D. B., et al., Proc. Natl. Acad. Sci. 83 p8073 (1986)).






Useful antibodies can be raised against the expressed proteins. It is
commonly not an absolute requirement that full length proteins are
produced, although this may influence the quality of the antibodies
produced. Peptides as short as 8 or 9 amino-acids in length can be
used as antigens (Germain, R. N. Nature 353 pp605-607 (1991), Rudensky,
A. Y., et al., Nature 353 p622-627 (1991)). Immunogenic peptides
could simply be synthesised using the amino-acid sequence translated
from a sequence or fragment of this invention. It is desirable,
although not absolutely required, that some means of producing purified
antibodies is adopted. When fusion polypeptides are used to raise
antibodies, an affinity matrix specific for the generic part of the
protein allows the fusion polypeptide to be immobilised (Smith, D. B.,
et al., Proc. Natl. Acad. Sci. 83 p8073 (1986)). The immobilised
polypeptide can then be used to affinity purify the antibodies.
Antibodies to both the generic part of the fusion polypeptide and the
part of interest are produced. When these need to be discriminated
between, a different affinity column can be used to remove only those
antibodies specific for the generic part of the polypeptide.
Alternatively, and as mentioned earlier, it can be arranged that the
boundary between the two separate protein components of the fusion
polypeptide has the recognition sequence for an endopeptidase with a
rare cutting site. The peptide of interest can then be released from
the affinity purified polypeptide by the action of the endopeptidase
(Nagai, K., and Thogersen, H. C. Methods Enzymol. 153 p461-481 (1987)).
Another alternative is to raise monoclonal antibodies against the
purified protein.




The antibodies can be used for localising in situ, or [...]
samples through, for example, ELISA or RIA assays, peptides against
which they were raised. These uses are particularly beneficial when
the results of the assays can be correlated to a disease condition, eg
cancer. For example tumour markers may be found and used to [...]
therapeutic agents. The antibodies can also be used to
monitor markers of undifferentiated growth, infection, cardiovascular
or immune disease or a therapeutic response. When the antibodies
recognise cell surface proteins they can be used in isolation or in
combination to isolate particular populations of cells. These in turn
can be used to isolate yet more cDNAs which will be enriched for yet
more of such surface markers for the population, which, if similarly
screened, will permit yet further subdivision of the population.
Ultimately, panels of antibodies which can describe particular disease
states will accrue. Such antibodies could be tailored for forensic
applications as well as diagnostic purposes and disease monitoring.

The sequences or fragments can also be used for genetic analysis and
mapping, for example, to diagnose the likelihood that a given
individual is predisposed towards a given genetic disease. In the
event of a sequence co-locating, genetically, with a disease gene, it
can be used for the derivation of new disease therapies based upon
precise genetic knowledge. Such therapies can include, for example,
the techniques of so-called "gene therapy" (Dusty Miller, A. Nature 357
p455-460 (1992)).

Antibodies can be produced against the protein of a genetic disease
with sufficient discriminating power to discriminate between diseased
and non-diseased states (Caskey, T. Genome Sequencing Conference,
Hilton Head, S. Carolina (1991)). This would be useful for reducing
the dependence of such tests on nucleic acid-based screens. Such
antibodies also have the advantage of allowing detection of faulty
expression of the protein, for example levels of expression which may
be important for development of the disease in slow onset conditions.

Also very important is that not all cDNAs are likely to be found by
conventional means, whereas the present sequences are, in one sense,
"comprehensive". The use of the class of cDNAs which corresponds of
necessity to truncated clones increases the chances that part of a cDNA
will be cloned free of any sequences that could otherwise compromise it
from being cloned. Sequence obtained can then be used to generate PCR
primers from which the remainder can be obtained without having to
clone.



This invention will now be further described and illustrated by means 
of the following Examples. 

All oligonucleotides used in these Examples were synthesised Trityl on
using an ABI 380B DNA Synthesizer according to the manufacturer's
instructions. Purification was by reverse phase HPLC (see, for
example, Becker, C. R., et al., J. Chromatography 326, p293-299
(1985)).



Example 1 



Human brain and adrenal tissues were obtained from a mixture of 12 to
15 week menstrual age foetuses and then snap frozen in liquid nitrogen
before storing in bijou bottles in a -80°C freezer. The two types of
tissue were used separately, directly from the freezer, to prepare cDNA
from which restriction fragments were generated for sorting into
subsets. 1 g portions of each of the separate tissues were homogenised,
using an Ultra-Turrax T25 Disperser (Janke and Kunkel,
IKA-Labortechnik), on ice in the presence of 4M guanidinium
isothiocyanate to solubilise macromolecules. RNA was isolated from each
homogenate by using centrifugation to sediment it through caesium
trifluoroacetate. This was performed using the Pharmacia kit according
to the manufacturer's instructions, except that centrifugation was
performed for 36 hours and the RNA obtained was finally desalted and
concentrated by performing two ethanol precipitations in succession
with two 70% ethanol washes after each precipitation. In each case,
polyA+ RNA (mRNA) was isolated from 200 to 400 µg of the total RNA by
binding it to magnetic oligo-dT coated beads (Dynal). Solution
containing unbound material was removed from the beads, which were
washed, and then mRNA was eluted directly for use. mRNA isolation was
performed in accordance with the manufacturer's instructions. Yields of
RNA from the beads were between 1 and 3% of the total RNA. 2 to 4 µg of
the eluted RNA were used for cDNA synthesis. cDNA synthesis was
performed according to the method of Gubler, U. and Hoffman, B. J.
(Gene 25 p263 (1983)) using a Pharmacia kit according to the
manufacturer's instructions. Oligo-dT was used to prime the first
strand cDNA synthesis reaction.
The cDNA was purified by extracting twice with phenol/chloroform and
then low molecular weight solutes including nucleic acids below ca. 300
bases were removed by passing the cDNA reaction mixture through a
Pharmacia S400 spun column used according to the manufacturer's
instructions. Running buffer for the column comprised 10 mM TrisHCl,
1 mM EDTA, 50 mM NaCl at pH 7.5.

The column eluate was adjusted to 10 mM Mg2+ and then the purified cDNA
was restricted by the action of 1 unit per 10 µl of the endonuclease
Fok I at 37°C for 1 hour, so that it would be able to accept adaptors
to enable fragment sorting.

The cDNA fragments were purified by two successive phenol/chloroform
extractions followed by passing them through S400 spun columns as
described above.

The adaptors used were oligonucleotides 5' N4N4N4N4TCCTTCTCCTGCGACAGACA
(SEQ ID: 1194) with the complementary strand 5' TGTCTGTCGCAGGAGAAGGA
(SEQ ID: 1195) and 5' AAN4N4TCTCGGACAGTGCTCCGAGAAC (SEQ ID: 1196) or 5'
TTN4N4TCTCGGACAGTGCTCCGAGAAC (SEQ ID: 1197) each with the complementary
5' biotinylated strand GTTCTCGGAGCACTGTCCGAGA (SEQ ID: 1198). These
were added to 25% of the eluted material by incubating together 200
pmoles of the mixture of double-stranded adaptors in the elution buffer
to which had been added MgCl2 to 10 mM, ATP to 10 mM and 0.025 units/µl
of T4 DNA ligase. The oligonucleotide 5' biotinylated
GTTCTCGGAGCACTGTCCGAGA (SEQ ID: 1198), and whichever of the
complementary oligonucleotides with which it was used, each comprised
1/32 of the molar proportion of the total adaptors. The final reaction
volume was 90 µl, which was heated to 65°C for 3 minutes and then
cooled to room temperature before the ligase was added. Ligation was
performed for 16 hours at 12°C.
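The pairing of the adaptor strands quoted above can be checked by computing reverse complements. The following is a minimal sketch only; the function and dictionary names are illustrative assumptions and the degenerate (N) and AA/TT positions at the 5' ends are deliberately left out, so only the constant portions of the adaptors are compared.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Constant portions of the adaptor strands (degenerate 5' positions omitted).
ADAPTOR_CORES = {
    "SEQ ID 1194": "TCCTTCTCCTGCGACAGACA",
    "SEQ ID 1196/1197": "TCTCGGACAGTGCTCCGAGAAC",
}

# Complementary strands as quoted in the text.
PARTNER_STRANDS = {
    "SEQ ID 1194": "TGTCTGTCGCAGGAGAAGGA",          # SEQ ID 1195
    "SEQ ID 1196/1197": "GTTCTCGGAGCACTGTCCGAGA",   # SEQ ID 1198
}


def adaptors_pair_correctly() -> bool:
    """True if every quoted partner strand is the exact reverse complement."""
    return all(
        reverse_complement(core) == PARTNER_STRANDS[name]
        for name, core in ADAPTOR_CORES.items()
    )


if __name__ == "__main__":
    print(adaptors_pair_correctly())
```

In both cases the quoted complementary strand is the exact reverse complement of the constant portion, so the only single-stranded region of each assembled adaptor is its degenerate 5' extension.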



Two successive phenol/chloroform extractions were performed to remove
the ligase. The final aqueous phase was passed through an S400 spun
column (Pharmacia) as described above except that the column was used
with 10 mM Tris pH 8.3/50 mM NaCl.

The column eluate was adjusted to 25 mM Mg2+, 0.5 mM dNTPs in a final
volume of 200 µl. The mixture was placed in a thermocycler (Techne
MW2) and heated to 78°C for 5 minutes. At this point 10 units of
cloned Taq DNA polymerase (AmpliTaq, Perkin Elmer) were added. This
was followed by an incubation at 72°C for 10 minutes to fill in the
unligated strand of the adaptor. After the second incubation 200 µl of
streptavidin coated magnetic beads (Dynal) prepared according to the
manufacturer's instructions were added to bind cDNA ligated to that of
the oligonucleotides which was complementary to the 5'
GTTCTCGGAGCACTGTCCGAGA (SEQ ID: 1198) biotinylated adaptor. Bead
binding was allowed to proceed at 28°C for 30 minutes with mixing every
10 minutes.



Un-biotinylated cDNAs were washed from the beads with 400 µl each of 2M
NaCl twice, fresh 0.15 mM NaOH four times at 28°C for 5 minutes each,
water twice and finally a buffer comprising 20 mM Tris pH 8.3, 50 mM
NaCl, and 25 mM Mg2+. The beads were then resuspended in 240 µl of the
final buffer including additionally 0.5 mM dNTPs and divided into
4 x 60 µl.



Four of the 60 µl aliquots, two from each tissue, were processed
further specifically to prime and copy a subset of the immobilised,
adaptored fragments. 2 pmoles of the primer 5' CTGTCTGTCGCAGGAGAAGGAA
(SEQ ID: 1201) were added to each of two aliquots, one from each
tissue. 2 pmoles of the primer 5' CTGTCTGTCGCAGGAGAAGGAG (SEQ ID:
1202) were added to each of the other two aliquots. 2.5 units of Taq
DNA polymerase were added to each reaction and 16 cycles of alternate
denaturation at 95°C for 30 seconds, annealing at 65°C for 2 minutes
and polymerisation at 72°C for 3 minutes were performed to accumulate
the selected single-strands in solution.
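The way in which the two primers select different subsets can be illustrated with a deliberately simplified model, in which a primer is taken to prime only if its 3'-terminal 21 bases are perfectly complementary to the template. The sketch below rests on that assumption; the template sequences, the 21-base threshold and all function names are hypothetical and are not taken from the Example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Selective primers quoted in the text; they differ only in the 3' base.
PRIMERS = {
    "SEQ ID 1201": "CTGTCTGTCGCAGGAGAAGGAA",
    "SEQ ID 1202": "CTGTCTGTCGCAGGAGAAGGAG",
}

# Constant portion of the adaptor strand to which the primers anneal.
CORE = "TCCTTCTCCTGCGACAGACA"


def can_prime(primer: str, template: str, match_len: int = 21) -> bool:
    """Simplified rule: the 3'-most match_len bases must pair perfectly."""
    site = reverse_complement(primer[-match_len:])
    return site in template


def sort_templates(templates):
    """Group templates by which selective primer could copy them."""
    return {
        name: [t for t in templates if can_prime(primer, t)]
        for name, primer in PRIMERS.items()
    }


if __name__ == "__main__":
    # Hypothetical templates differing only in the base next to the core.
    examples = ["GGATCCAT" + base + CORE for base in "TCG"]
    for name, hits in sort_templates(examples).items():
        print(name, hits)
```

Under this model the primer ending in A copies only templates carrying T next to the adaptor core, the primer ending in G copies only those carrying C, and templates with any other base at that position are left uncopied.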

On completion of the DNA synthesis reactions a further 30 µl of
resuspended beads were added to each reaction to remove the
biotinylated fragments. The reaction was incubated at 28°C for 30
minutes, mixing every 10 minutes to ensure that the biotinylated strands
were bead bound. Each aqueous phase containing the newly synthesised
strands was then removed and extracted with phenol/chloroform twice to
remove the enzyme before being further purified by passing through an
S400 spun column equilibrated with 10 mM Tris pH 8.3/50 mM NaCl, as
described above.

Rounds of PCR amplification of subsets of the selected fragments were
performed by using the original primer in each case, together with one
of the primers 5' GTTCTCGGAGCACTGTCCGAGAG (SEQ ID: 1199) or 5'
GTTCTCGGAGCACTGTCCGAGAC (SEQ ID: 1200). This simultaneously rendered
the fragments double-stranded and increased the amounts of available
material. It was not known how many cycles of amplification would be
required at this stage, since each primer pair would be expected to
behave differently. It was therefore necessary directly to determine
a suitable number empirically by using standard agarose gel
electrophoresis to examine the reaction products after a given number
of cycles. In some cases, to avoid the accumulation of non-specific
products, it was necessary to perform an initial 5 cycles of
amplification with both of the primers present at 2 pmoles each. All
reactions were performed using 8 µl or 12.5%, whichever was the larger
but not exceeding 12 µl, of the column effluent above. Reaction
conditions were adjusted to 20 mM Tris pH 8.3, 50 mM NaCl, 25 mM Mg2+,
0.5 mM dNTPs and 2.5 units of Taq DNA polymerase in a final volume of
40 µl. Apart from when an initial amplification with 2 pmoles of each
primer was performed, 20 pmoles of each primer were used. Cycles of
amplification were performed at 95°C for 30 seconds, 65°C for 1 minute
and 72°C for 3 minutes.
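The rule for the amount of column effluent used per reaction can be written as a short calculation. This is a sketch of one reading of the rule only (the larger of 8 µl and 12.5% of the effluent, capped at 12 µl); the function name and the example volumes are assumptions made for illustration.

```python
def pcr_input_volume(effluent_volume_ul: float) -> float:
    """Volume of column effluent (in µl) to use in one 40 µl reaction.

    Takes the larger of 8 µl and 12.5% of the effluent, but never more
    than 12 µl.
    """
    candidate = max(8.0, 0.125 * effluent_volume_ul)
    return min(candidate, 12.0)


if __name__ == "__main__":
    # Hypothetical effluent volumes, chosen to exercise each branch.
    for volume in (40.0, 80.0, 200.0):
        print(volume, pcr_input_volume(volume))
```

For a 40 µl effluent the 8 µl minimum applies, for 80 µl the 12.5% fraction gives 10 µl, and for 200 µl the 12 µl ceiling applies.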



For the purposes of cloning, selected cDNA was amplified as described
immediately above, except that the reaction was not monitored.
Instead, the number of cycles which had previously been shown to just
give rise to all observable products plus another 4 cycles were
performed. In addition, an extra 72°C for 10 minutes incubation was
performed after the last cycle.

The products of the reaction were then prepared for directional
cloning. Water was added to adjust the final reaction volume to 60 µl.
Enzyme was removed by two successive phenol/chloroform extractions.
The final aqueous mixture was passed through an S400 column as
described above, except that it had been equilibrated with 10 mM Tris
HCl pH 7.5, 50 mM NaCl.

For directional cloning, advantage was taken of the different known
sequences introduced at each end of the selected cDNAs by the adaptors
in a modification of the method of Aslandis, C. and de Jong, P. J.
(Nucl. Acids Res. 18, p6156 (1990)). Different cohesive ends were
produced on each end by using the exonuclease activity of T4 DNA
polymerase to resect from the 3' end, to the first T in each case. To
75 µl or 75% of the column eluate, whichever was least, were added
9.5 µl of 100 mM TrisHCl pH 7.4, 100 mM MgCl2, and 9.5 µl of 0.5 mM
dTTP. 16 units of T4 DNA polymerase were added and the reaction
incubated in a water bath at 37°C for 30 minutes. The enzyme was
removed by extracting with phenol/chloroform, twice successively. The
salt of the final aqueous phase was adjusted by passing it through an
S300 column (Pharmacia) equilibrated with 10 mM TrisHCl pH 7.4, 1 mM
EDTA as described above.
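The effect of resecting each 3' end to the first T can be sketched as a simple string operation. The model below assumes that the exonuclease removes bases from the 3' end until a T becomes the terminal base, which is then retained in the presence of dTTP; the function name is illustrative, and the two input strands are the reverse complements of the constant parts of the amplification primers rather than full-length products.

```python
def resect_3prime(strand: str, stop_base: str = "T"):
    """Trim a 5'->3' strand from its 3' end back to the last stop_base.

    Returns (remaining, removed). The stop_base itself is kept as the new
    3' terminus. If stop_base is absent the whole strand is removed.
    """
    idx = strand.rfind(stop_base)
    if idx < 0:
        return "", strand
    return strand[: idx + 1], strand[idx + 1:]


# 3' ends of the two product strands (reverse complements of the primers
# GTTCTCGGAGCACTGTCCGAGA and CTGTCTGTCGCAGGAGAAGGA respectively).
END_A = "TCTCGGACAGTGCTCCGAGAAC"
END_B = "TCCTTCTCCTGCGACAGACAG"

if __name__ == "__main__":
    for end in (END_A, END_B):
        remaining, removed = resect_3prime(end)
        print(remaining, removed, len(removed))
```

On this model 8 bases (CCGAGAAC) are removed from one end and 11 bases (GCGACAGACAG) from the other, so the two ends of each fragment are left with single-stranded 5' extensions of different length and sequence, which is what allows insertion in a defined orientation.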



The E.coli plasmid cloning vector pBluescript KS+ (Alting-Meese, M. A.
and Short J. M., Nucl. Acids Res. 17 p9494) was prepared for accepting
the resected cDNA by restriction cleavage at the BamHI and HindIII
sites and then adaptoring the resultant cohesive ends using the
specific adaptors produced by the oligonucleotide 5' AGCTCGGCTCGAGTCTG
(SEQ ID: 1203) with its partially complementary oligonucleotide 5'
GCGACAGACAGCAGACTCGAGCCG (SEQ ID: 1204) and the oligonucleotide 5'
GATCCGGCTCGAGT (SEQ ID: 1205) with its partially complementary
oligonucleotide 5' CCGAGAACACTCGAGCCG (SEQ ID: 1206). Preparation of
the vector and adaptoring were performed according to standard
procedures. Insertion of the cDNA was performed between the BamHI and
HindIII restriction sites. Recombinant vectors were transformed into
the host XL1-Blue (Bullock, W. O. et al Biotechniques 5 p376-378
(1987)) by the method of Hannahan, D. (J. Mol. Biol. 166 p577-580
(1983)). Suitable standard controls for the ligations and
transformations were also included.

Post transformation procedures were as described in "Molecular
Cloning", 2nd Edition (Sambrook J., Fritsch, E. F., and Maniatis, T.
CSH Press (1989)). Colonies were produced by plating onto X-gal/IPTG
L-agar plates containing 50 µg/ml ampicillin and 10 µg/ml tetracycline.
Clear colonies were picked, each into a separate well of a microtitre
plate, containing 100 µl of L-broth and 50 µg/ml ampicillin. Growth was
allowed to occur for 16 hours at 37°C. 100 µl of 50% or 30% glycerol
was added to plates which were archived at -20°C or -80°C,
respectively.

Bacteria corresponding to those archived were used for preparing
templates for sequencing by the dideoxy method (Sanger, F., Nicklen, S.
and Coulson, A. R. Proc. Natl. Acad. Sci. 74 p5463-5467 (1977)).
Bacteria for this purpose were either grown on L-agar plates containing
50 µg/ml of ampicillin, prepared at the same time as they had been
grown in liquid culture, or after plating out from the archive.
Alternatively, fresh liquid cultures were inoculated from the archive.
In all cases, cDNA inserts were amplified for sequencing by PCR
(Saiki, R. K. et al Science 239 p487-491 (1988)). PCR was either
performed using bacteria directly added to the reaction, by a
toothpick, or PCR was performed using 1/50th of the plasmid isolated by
preparative methods (Holmes, D. S. and Quigley, M. Anal. Biochem. 114
p193 (1981)) from the bacteria in the liquid cultures or from the
plates.



20 pmoles of each of the PCR primers 5' biotinylated GTAAAACGACGGCCAGT
(SEQ ID: 1207) and 5' CGAGGTCGACGGTATCG (SEQ ID: 1208) were used in
40 µl reactions containing 2.5 mM Mg2+, 50 mM KCl, Tris-HCl pH 8.3 and
0.25 units of AmpliTaq (Cetus). Reactions were performed at 95°C for
1 minute, followed by 35 cycles at 95°C for 30 seconds, 60°C for 30
seconds and 72°C for 40 seconds. After the cycles, a final incubation
at 72°C for 5 minutes was performed.

After PCR, standard agarose gel electrophoresis was used to determine
which reactions had been successful. The biotinylated strands of
successful reactions were then recovered for single-stranded sequencing
by binding them to streptavidin coated beads (Dynal) and then washing,
all according to the manufacturer's instructions, except that the
washing steps were either performed manually or performed automatically
in the 96 well microtitre plate format using a Biomek robotic
work-station attached to a side-arm loader (Beckman).

Dideoxy chain termination sequencing reactions were performed using the
immobilised, biotinylated strands as templates and 2 pmoles of the
oligonucleotide 5' CGAGGTCGACGGTATCG (SEQ ID: 1209) as primer.
Reactions were performed using fluorescently-labelled terminators (Du
Pont) or a fluorescein-labelled primer (Pharmacia) according to the
manufacturer's instructions. Reactions were analysed using automated
DNA sequencers. A Genesis 2000 was used for the "Du Pont" reactions
and an A.L.F. for the "Pharmacia" reactions. Bases were assigned for
the Genesis 2000 reads using the manufacturer's Base Caller software.
Files of called bases were then transferred to a SUN network from an
Apple Macintosh computer which had been used for base calling. Raw
data from the A.L.F. reads was directly transferred to a SUN network
where bases were called using the public domain "trace editor software"
(TED). In both cases, files of called bases were entered into a
Sybase™ database. Entering data entailed automatically removing vector
and adaptor or linker sequences, but not editing ambiguous bases.
After removal of the unwanted bases, files were automatically compared
to other sequences in the cDNA database and the latest versions of the
publicly available databases, GENBANK and SWISSPROT. Searches were
performed with the "basic local alignment search tool" (BLAST) (Karlin,
S. and Altschul, S. F. Proc. Natl. Acad. Sci. 87 p2264-2268 (1990)).
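The automatic removal of vector and adaptor sequences before database entry can be pictured as follows. The software actually used is not described in this Example, so the sketch is purely hypothetical: the function name, the exact-match rule and the flanking sequences in the usage example are all assumptions. It only illustrates the stated behaviour that flanking sequence is stripped while ambiguous bases (N) are left unedited.

```python
def strip_flanks(read: str, left_flank: str, right_flank: str) -> str:
    """Return the part of a read lying between two known flanking sequences.

    Everything up to and including the first left_flank, and everything
    from the last right_flank onwards, is discarded. A flank that is not
    found is ignored. Ambiguous bases (N) in the insert are not altered.
    """
    start = read.find(left_flank)
    start = 0 if start < 0 else start + len(left_flank)
    end = read.rfind(right_flank)
    if end < start:
        # Right flank absent (or lying before the insert start): keep to end.
        end = len(read)
    return read[start:end]


if __name__ == "__main__":
    # Hypothetical read with hypothetical flanking sequences.
    print(strip_flanks("TTGGATCCACGTNACGTAAGCTTGG", "GGATCC", "AAGCTT"))
```

A production system would normally tolerate mismatches in the flanks, since base-calling errors are most frequent at the ends of a read; exact matching is used here only to keep the illustration short.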

Sequences SEQ ID Nos 1 to 610, given hereinafter, were obtained by the
above procedure.



Example 2

A second method of preparing cDNA libraries for obtaining gene
fragments of the invention took advantage of the PCR normalisation
process described above. Standard procedures were used to prepare mRNA
from RNA that had been isolated by standard caesium chloride buoyant
density gradient methods from a full term human placenta. The
oligonucleotide LNotdT, sequence 5' TACGTTCGACAAGCTTGAATTCGCGGCCGC(T)
(SEQ ID: 1210) was used at 1 µM with AMV reverse transcriptase, to
prime first strand cDNA synthesis under standard conditions from 0.5 µg
of the placental mRNA. Temperatures above 65°C were used to inactivate
the reverse transcriptase and then the volume of the reaction was made
up to 100 µl with water.

PCRs were then performed in reactions containing 1 µl of the diluted
cDNA, 10 mM Tris-HCl pH 8.3, 40 mM KCl, 1.5 mM MgCl2, 0.01% gelatin,
200 µM dNTPs, 10 µCi α-32P dCTP, 1 µM each of the primers llADl,
sequence 5' GCC(TA)(GC)CGCCGA (SEQ ID: 1211), and LNotdT and Taq DNA
polymerase. An initial denaturation period of 95°C for 90 seconds was
followed either by 35 cycles of standard PCR, comprising 95°C for 30
seconds, 45°C for 30 seconds and 72°C for 30 seconds or alternatively
3 cycles of the standard PCR already described followed by 27 cycles of
Cot PCR during which an additional step of 72°C for 16 minutes was
placed between all of the 95°C and 45°C steps of the standard PCR. The
standard PCR was followed by a single 72°C for 3 minutes step while the
Cot PCR was followed by one standard PCR cycle except that the 72°C
incubation was performed for 3 minutes.
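The difference in running time between the two cycling programmes can be worked out from the step durations given above. The following is a sketch only: it sums the programmed hold times and ignores the time taken to change temperature between steps, and the variable and function names are illustrative.

```python
# Each step is (temperature in °C, duration in seconds).
STANDARD_CYCLE = [(95, 30), (45, 30), (72, 30)]
# Cot cycle: an extra 72°C, 16 minute step between the 95°C and 45°C steps.
COT_CYCLE = [(95, 30), (72, 16 * 60), (45, 30), (72, 30)]
# Closing cycle of the Cot programme: standard cycle with 3 minutes at 72°C.
FINAL_COT_CYCLE = [(95, 30), (45, 30), (72, 180)]

INITIAL_DENATURATION = 90  # seconds at 95°C
FINAL_EXTENSION = 180      # seconds at 72°C, standard programme only


def hold_seconds(steps, repeats: int = 1) -> int:
    """Total programmed hold time for a cycle repeated `repeats` times."""
    return repeats * sum(duration for _, duration in steps)


def standard_program_seconds() -> int:
    return INITIAL_DENATURATION + hold_seconds(STANDARD_CYCLE, 35) + FINAL_EXTENSION


def cot_program_seconds() -> int:
    return (
        INITIAL_DENATURATION
        + hold_seconds(STANDARD_CYCLE, 3)
        + hold_seconds(COT_CYCLE, 27)
        + hold_seconds(FINAL_COT_CYCLE)
    )


if __name__ == "__main__":
    print(standard_program_seconds() / 60, "minutes")
    print(cot_program_seconds() / 3600, "hours")
```

On these figures the standard programme holds for 3420 seconds (57 minutes) and the Cot programme for 28950 seconds (about 8 hours), almost all of the difference being the 27 extended 72°C steps.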

Products of the PCR reaction were end repaired by adding 5 units of T4
DNA polymerase to the reaction and then incubating at 37°C for 10
minutes. Enzymes were removed by phenol extraction. The cDNA was
precipitated by 70% ethanol, dried and then resuspended in NotI buffer.
20 units of NotI were used to digest the cDNA under standard
conditions. cDNA was again phenol extracted and ethanol precipitated.
10% of the purified NotI cut DNA were ligated to the vector pBluescript
(Alting-Meese, M. A. and Short J. M. Nucl. Acids Res. 17 p9494) which
had been prepared as standard to receive this DNA by restricting with
the enzymes NotI and EcoRV. Transformation and processing of clear
colonies was performed as described above except that the host E. coli
strain DH5α was used in place of XL1-Blue.



Preparation of clones for sequencing, sequencing and sequence analysis
of cDNAs in clones thus-produced were performed as described in Example
1.



Sequences SEQ ID Nos 611 to 772, given hereinafter, were obtained by
the above procedure.



Example 3 



cDNA libraries corresponding to adult brain cortex (Clontech
Laboratories, Inc., Cat No. HL10036) and adult bone marrow (Clontech
Laboratories, Inc., Cat No. HL10586) prepared in lambda gt11 phage were
transfected into E.coli Y1090 and plated out for colour selection of
recombinant plaques ("Molecular Cloning", 2nd Edition Sambrook J.,
Fritsch, E. F., and Maniatis, T. CSH Press (1989)). 192 lambda Zap
clones, corresponding to rhabdomyosarcoma cDNAs and a gift from C.
Cooper, ICR, Sutton, were similarly plated except that the host
XL1-Blue was used.

Clear plaques from each library were resuspended in 5 µl of Tris-HCl pH
8, 1 mM EDTA. 2 µl of the resultant phage suspensions were added
directly to PCRs for the purpose of amplifying the cDNA inserts for
sequencing. PCR was performed as described in Example 1, except that
the oligonucleotides used as primers for the lambda gt11 clones were 5'
GGTGGCGACGACTCCTGGAGCCCG (SEQ ID: 1212) and 5' TTGACACCAGACCAACTGGTAATG
(SEQ ID: 1213). Whichever of the oligonucleotides was to be used to
prime the strand which would serve as the sequencing template was used
in biotinylated form.

Preparation of clones for sequencing, sequencing and sequence analysis
of cDNAs in clones thus-produced was performed as described in Example
1, except that 2 pmoles of the primers that were unbiotinylated in the
PCR were used as sequencing primers.

Sequences SEQ ID Nos 773 to 1193, given hereinafter, were obtained by
the above procedure.

The following are the SEQUENCE LISTINGS which comprise sequences SEQ ID
NOS 1 to 1213 referred to hereinbefore. Certain of these sequences are
preferred, and are listed as such after the main SEQUENCE LISTINGS.



CCGTGGGTC                                                           159

(2) INFORMATION FOR SEQ ID: 493:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 197 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID: 493:
GGGCAGAGNA AGAACTGTTC CACCAGGTGA ACAGTCCTAC CTGCTTGGTA               50
CCATAGTCCC TCAATAAGAT TCAGAGGAAG AAGCTTATGA AACTGAAAAT              100
CAAATCAAGG TATCGGGAAG AATAATTTCC CCTCGATTCC ACAGGAGGGA              150
AGACCACACA ATATGTNGTG CTGGGGCTCC CCAAGGCCCT GCCACCT                 197

(2) INFORMATION FOR SEQ ID: 494:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 188 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID: 494:
GATGGGGAAG GGCATCCCAA CACAGCCTGT GGATCCTGGG GCATCTGGAA               50
GGGCGCACCA TCAGCAGCCT CACCAGCTGT GAGCCTGCTA TCGGGCCTGC              100
CCCTCCAATA AAAGTGTGAG AACTCCACTG TGTGCCCTGT CTTTGGGCAG              150
GGAGGGCTGC TGTGAGTGGA GTCTTGAGTN GGTGAGTG                           188
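The declared LENGTH of each listing entry can be checked against the sequence as printed. The following is a minimal sketch of such a check, assuming that sequence lines are written in space-separated blocks and that ambiguous bases (N) count towards the length; the function and variable names are illustrative.

```python
def sequence_length(lines) -> int:
    """Count bases in a listing entry, ignoring the spaces between blocks."""
    return sum(len(line.replace(" ", "")) for line in lines)


SEQ_493 = [
    "GGGCAGAGNA AGAACTGTTC CACCAGGTGA ACAGTCCTAC CTGCTTGGTA",
    "CCATAGTCCC TCAATAAGAT TCAGAGGAAG AAGCTTATGA AACTGAAAAT",
    "CAAATCAAGG TATCGGGAAG AATAATTTCC CCTCGATTCC ACAGGAGGGA",
    "AGACCACACA ATATGTNGTG CTGGGGCTCC CCAAGGCCCT GCCACCT",
]

SEQ_494 = [
    "GATGGGGAAG GGCATCCCAA CACAGCCTGT GGATCCTGGG GCATCTGGAA",
    "GGGCGCACCA TCAGCAGCCT CACCAGCTGT GAGCCTGCTA TCGGGCCTGC",
    "CCCTCCAATA AAAGTGTGAG AACTCCACTG TGTGCCCTGT CTTTGGGCAG",
    "GGAGGGCTGC TGTGAGTGGA GTCTTGAGTN GGTGAGTG",
]

if __name__ == "__main__":
    print(sequence_length(SEQ_493), sequence_length(SEQ_494))
```

Both entries agree with their declared lengths of 197 and 188 bases respectively.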




CLAIMS: 

1. A nucleic acid fragment encoding a gene product or portion
thereof and comprising any one of:

(a) a sequence selected from SEQ ID Nos 1 to 1193;

(b) an allelic variation of a sequence as defined in (a); or

(c) a sequence complementary to (a) or (b).

2. A nucleic acid sequence as set out in any one of SEQ ID Nos 1 to
1193, or a complement or allelic variation thereof.

3. A sequence as claimed in claim 2 and which exhibits no more than
90% homology to a human sequence known per se.

4. A nucleic acid fragment comprising a portion of a sequence as
defined in claim 2 or claim 3 of sufficient size such that a probe of
the same size and exhibiting complementarity to said portion can
hybridize to said sequence as defined in claim 2 or claim 3.

5. A fragment as claimed in claim 4, wherein said portion is at
least 15 bases in length.

6. A fragment as claimed in any one of claims 1, 4 or 5 and encoding
at least a portion of a biologically active polypeptide.

7. A nucleic acid sequence as claimed in claim 2 or claim 3 and
encoding at least a portion of a biologically active polypeptide.

8. A DNA construct comprising a fragment as defined in any one of
claims 1, 4, 5 or 6 or a sequence as defined in any one of claims 2, 3
or 7, together with a control or regulatory sequence.

9. A construct as claimed in claim 8 which encodes a fusion protein
comprising a known protein and the polypeptide encoded by said fragment
or sequence.

10. A construct as claimed in claim 9, wherein the fusion protein
encoded is a cleavable fusion protein having an endopeptidase
recognition site positioned between codons corresponding to said known
protein and said fragment or sequence.

11. The use of a fragment as defined in any one of claims 1, 4, 5 or
6 or a sequence as defined in any one of claims 2, 3 or 7 to produce a
gene.

12. A DNA fragment comprising a gene obtainable by the use defined in
claim 11.

13. An expression vector comprising a fragment as defined in any one
of claims 1, 3, 5 or 6, a sequence as defined in any one of claims 2,
3 or 7, a DNA construct as defined in any one of claims 8 to 10, or a
DNA fragment as claimed in claim 12, positioned such that that nucleic
acid sequence which encodes the polypeptide corresponding to said
fragment, sequence or DNA fragment is in operable reading frame with a
control or regulatory sequence.

14. A vector as claimed in claim 13, wherein said vector control or
regulatory sequence comprises a regulatable promoter.

15. Host cells which incorporate as a heterologous part of their
expressible genetic information a fragment as defined in any one of
claims 1, 3, 5 or 6, a sequence as defined in any one of claims 2, 3 or
7, or a DNA fragment as defined in claim 12.

16. A process for the production of a polypeptide comprising
cultivating host cells as defined in claim 15.

17. An antibody directed against a polypeptide obtainable by the
performance of a process as defined in claim 16.

18. An antibody as claimed in claim 17 and which is monoclonal.

19. A novel gene product or portion thereof encoded by a fragment as
defined in any one of claims 1, 3, 5 or 6, or encoded by a sequence as
defined in any one of claims 2, 3 or 7, or encoded by the gene
comprised in a DNA fragment as defined in claim 12.
